Identification of Personality-Related Candidate Genes in Thoroughbred Racehorses Using a Bioinformatics-Based Approach Involving Functionally Annotated Human Genes

Simple Summary
By using the Thoroughbred genome variant database to identify orthologues of personality-related human genes, we identified 18 potential personality-related genes in horses. Relative to the EquCab3.0 reference genome, these candidate genes carry a total of 55 variants that cause amino acid substitutions and may affect the function of the encoded proteins. Moreover, 15 of the 18 genes have not previously been linked to personality in horses, suggesting that this exploratory search for related genes based on human evidence can be useful for equine behavioral genetics. Although this bioinformatics approach is less informative for horses than for humans because supporting personality research is lacking, this study highlights its potential for identifying candidate genes. If future studies with equine behavioral datasets validate these potential personality–gene associations, this bioinformatics strategy may become important in the field of equine genetics.

Abstract
Accounting for the personality traits of racehorses (e.g., flightiness, anxiety, and affability) is considered essential to improve training efficiency and decrease accident frequency, especially when retraining for a second career that may involve contact with inexperienced personnel after retiring from racing. Studies on human personality-related genes are frequently conducted; however, such studies are rare in horses because a consistent methodology for personality evaluation is lacking. Using the recently published whole genome variant database of 101 Thoroughbred horses, we examined horse genes orthologous to human genes related to the Big Five personality traits and identified 18 personality-related candidate genes in horses. These genes include 55 variants involving non-synonymous substitutions that are likely to strongly affect the encoded proteins. Moreover, we evaluated the allele frequencies and the functional impact on the proteins in terms of the differences in molecular weight and hydrophobicity between the reference and altered amino acids. Fifteen of the identified genes have not previously been linked to equine personality, and their associations with personality remain unclear. Although more studies comparing genetic and behavioral information are required to validate this approach, it may be useful for personality evaluation under limited conditions.


Introduction
In the modern world, horses are used in various sports disciplines such as racing, show jumping, and dressage, as well as in various equine-assisted therapeutic and educational programs. Thoroughbred horses are mainly used in racing, but they have limited use after their retirement. In terms of animal welfare, international bodies such as the International Forum for the Aftercare of Racehorses (IFAR) have undertaken efforts to retrain racehorses and develop their post-retirement careers [1][2][3]. Personality and temperament are critical factors when selecting an appropriate individual for a specific purpose [4]. Since the learning ability of horses differs based on their personality and temperament, comprehensive knowledge of these aspects is crucial to improving the success of retraining [5,6]. Nevertheless, few studies have investigated how temperament or personality influences performance or training.
Failures in the post-retirement transition from horseracing often occur in careers that involve greater contact with strangers or inexperienced personnel. One major problem for such horses is increased accident rates with inexperienced riders [7,8]. This suggests that the retraining these horses receive does not adequately prepare them for amateur riders. Therefore, carefully selecting horses for specific careers and setting a retraining direction in accordance with their intended use in their second career is expected to increase the rate of successful retraining.
Recently, twin or family research and genome-wide association studies have revealed human personality and related diseases to be partially controlled by genes [9][10][11]. The personality of horses may similarly be influenced by genetic factors, and additively by environmental factors [12,13]. For example, the equine ASIP genotype that influences coat color is associated with a self-reliant temperament [14], and a single nucleotide polymorphism (SNP) in the variable number of tandem repeats region of the equine DRD4 gene is significantly associated with curiosity and vigilance [15]. Moreover, the oxytocin and the dopaminergic pathways are associated with anxiety or fearfulness in horses, which constitute a temperament called "Neuroticism" in humans; oxytocin is also related to trainability [16,17]. Nine genes have been proposed to be personality-related candidate genes in horses [18]; however, few studies on each personality-related gene have been reported [15,19]. A lack of adequate methods for conducting personality-related research in horses is the primary reason for limited progress in this field when compared to the extensive research conducted in humans [20]. If equine personality genetics advances, a genetic test could be used to help select an ideal second career for retired racehorses, thus enabling more specialized retraining.
In humans, multiple personality-related genes have been identified by genome-wide association studies and the candidate gene approach. Recently, whole genome sequencing of 101 Thoroughbreds was performed, and the whole genome variant database of this population was published [21]. Therefore, we aimed to identify horse genes that are orthologous with human personality-related genes and develop a method for identifying equine personality-related gene candidates by referring to the variants of these genes extracted from the whole genome variant database.

Materials and Methods
Figure 1 shows the databases referenced, their inputs and outputs, and the overall pipeline from methods to results.

Search for Personality-Related Candidate Genes in Horses
First, human personality-related genes were identified using the search term "personality trait" in the National Center for Biotechnology Information (NCBI) gene database [22]. Subsequently, relevant articles dealing with the identified genes were collected; the articles discussed each facet of the Big Five personality traits: Agreeableness, Conscientiousness, Extraversion, Neuroticism, and Openness. These categories have been determined to be heritable in humans [23]. Horse genes orthologous to the identified human genes were then searched for in the NCBI gene database.
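As an illustration of the first search step, the following minimal sketch builds a query URL for the public NCBI E-utilities ESearch endpoint against the Gene database. It only constructs the URL and does not send a request. The organism restriction, the `retmax` value, and the helper name `build_gene_search_url` are assumptions made for this example; the study states only the search term "personality trait" and does not describe how the query was issued.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for database searches.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_search_url(term: str, organism: str, retmax: int = 100) -> str:
    """Return an ESearch URL that queries the NCBI Gene database.

    The [Organism] field restriction is an assumption of this sketch;
    the source names only the search term itself.
    """
    query = f'"{term}" AND "{organism}"[Organism]'
    params = {"db": "gene", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical usage: the human search described in the text.
human_url = build_gene_search_url("personality trait", "Homo sapiens")
print(human_url)
```

The same helper could be reused with `"Equus caballus"` and a gene symbol to look up each horse orthologue, although the orthologue search in the study may have been performed interactively.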

Search for DNA Variants in Equine Personality-Related Gene Candidates
The variants of each gene were extracted from a whole genome variant database of 101 Thoroughbreds that uses EquCab3.0 as a reference genome [21,24]. Variants causing non-synonymous substitution were selected as mutations likely to highly impact protein function. Variant information was checked using the Ensembl Genome Browser and the University of California Santa Cruz (UCSC) Genome Browser [25,26].

Figure 1. Methods pipeline. This shows the databases used in colored boxes, along with the inputs and outputs for each to the right side of each box. The sentences below the outputs show the objective for using each database. Finally, we list the items used to evaluate the importance of individual SNPs and their potential effect on personality.
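The selection of non-synonymous variants can be sketched as a simple filter over annotated variant records. This is a minimal illustration, not the authors' implementation: the `Variant` record layout, the use of Sequence Ontology consequence terms, and the set of terms counted as non-synonymous are assumptions, and the `EXAMPLE1` record is hypothetical. The two missense records use loci named elsewhere in this article.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    chrom: str
    pos: int
    consequence: str  # Sequence Ontology term, e.g. "missense_variant"


# Consequence terms treated as non-synonymous in this sketch (an assumption).
NON_SYNONYMOUS = {"missense_variant", "inframe_insertion"}


def candidate_variants(variants):
    """Group non-synonymous variants by gene; other variants are dropped."""
    by_gene = defaultdict(list)
    for v in variants:
        if v.consequence in NON_SYNONYMOUS:
            by_gene[v.gene].append(v)
    return dict(by_gene)


records = [
    Variant("MAOA", "chrX", 36799409, "missense_variant"),
    Variant("CDH13", "chr3", 31845727, "missense_variant"),
    # Hypothetical record, included only to show that it is filtered out.
    Variant("EXAMPLE1", "chr1", 1000, "synonymous_variant"),
]

by_gene = candidate_variants(records)
for gene, hits in sorted(by_gene.items()):
    print(gene, [f"{v.chrom}:{v.pos}" for v in hits])
```

Genes with no surviving variants simply do not appear in the result, which mirrors how orthologues without non-synonymous variants drop out of the candidate list.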

Characterization of Missense Mutations and Their Effects on Equine Personality
Based on each variant's codon change, the differences in molecular weight and hydrophobicity index between the reference and substituted amino acids were calculated manually. Substitutions with an absolute molecular weight difference of ≥40.00 Da and those with an absolute hydrophobicity difference of ≥1.00 were treated as highly significant [27]. Disulfide bonds and posttranslational modifications were investigated in regard to their effect on the conformation or function of encoded proteins.
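The calculation described above can be sketched as follows. The molecular weights are standard values for free amino acids. The hydrophobicity scale is an assumption: the Kyte-Doolittle hydropathy index is used here because the text does not name the scale taken from [27]. The two thresholds are those stated in the text and are reported as separate flags.

```python
# Molecular weights of free amino acids in Da (standard reference values).
MOLECULAR_WEIGHT = {
    "Ala": 89.09, "Arg": 174.20, "Asn": 132.12, "Asp": 133.10, "Cys": 121.16,
    "Gln": 146.15, "Glu": 147.13, "Gly": 75.07, "His": 155.16, "Ile": 131.17,
    "Leu": 131.17, "Lys": 146.19, "Met": 149.21, "Phe": 165.19, "Pro": 115.13,
    "Ser": 105.09, "Thr": 119.12, "Trp": 204.23, "Tyr": 181.19, "Val": 117.15,
}

# Kyte-Doolittle hydropathy index. ASSUMPTION: the source does not state
# which hydrophobicity scale was used.
HYDROPATHY = {
    "Ala": 1.8, "Arg": -4.5, "Asn": -3.5, "Asp": -3.5, "Cys": 2.5,
    "Gln": -3.5, "Glu": -3.5, "Gly": -0.4, "His": -3.2, "Ile": 4.5,
    "Leu": 3.8, "Lys": -3.9, "Met": 1.9, "Phe": 2.8, "Pro": -1.6,
    "Ser": -0.8, "Thr": -0.7, "Trp": -0.9, "Tyr": -1.3, "Val": 4.2,
}

MW_THRESHOLD = 40.00     # Da, from the text
HYDRO_THRESHOLD = 1.00   # index units, from the text


def substitution_effect(ref: str, alt: str) -> dict:
    """Return the signed differences (alt minus ref) and threshold flags."""
    delta_mw = round(MOLECULAR_WEIGHT[alt] - MOLECULAR_WEIGHT[ref], 2)
    delta_hydro = round(HYDROPATHY[alt] - HYDROPATHY[ref], 2)
    return {
        "delta_mw": delta_mw,
        "delta_hydropathy": delta_hydro,
        "mw_flag": abs(delta_mw) >= MW_THRESHOLD,
        "hydropathy_flag": abs(delta_hydro) >= HYDRO_THRESHOLD,
    }


# Cysteine to arginine, the substitution reported at chrX:36799409 on MAOA.
effect = substitution_effect("Cys", "Arg")
print(effect)
```

Under these assumed tables, the cysteine-to-arginine substitution exceeds both thresholds. With a different hydrophobicity scale the absolute values would change, so the flags should be read as illustrative only.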

Results

Search for Personality-Related Candidate Genes in Horses
Twenty-eight human personality-related genes were extracted from PubMed by searching the NCBI gene database. The horse orthologues for all 28 genes were found in the horse genome based on gene annotations obtained using Ensembl.
Among the 28 identified genes, ANKK1, APOE, BDNF, CNR1, COMT, DRD4, IL6, and SLC6A4 were associated with two or more personality traits. COMT was specifically associated with all personality traits. Among all the traits explored, Neuroticism was associated with the largest number of genes (23 genes), and Extraversion, Conscientiousness, Agreeableness, and Openness were associated with six, four, three, and four genes, respectively.

Search for DNA Variants in Equine Personality-Related Gene Candidates
After searching all variants of the 28 orthologues in the Thoroughbred variant database, 55 variants causing non-synonymous substitution in 18 different genes were extracted (Table S1). Among the 55 variants, 54 were SNPs that substituted one nucleotide for another, and one was an insertion that replaced a single nucleotide with a longer nucleotide sequence. As checked using the Ensembl and UCSC genome browsers, these mutations were not found at the corresponding locations in the human genome.
SNPs in the coding region of ANKK1, which is associated with Neuroticism and Extraversion, were the most frequent (13) among the identified variants. In contrast, nine out of the ten genes excluded from the candidate search because they had no variants with non-synonymous substitutions were associated with Neuroticism. In 15 of the 18 identified genes, the substitutions were principally related to major signaling pathways in the central nervous system, such as the intracellular signaling pathway, the neurotransmission pathway, the hypothalamic-pituitary-adrenal axis, and intercellular connections. Additionally, APOE and BDNF, which are neurotrophic-related factors linked to the accumulation of amyloid beta in the brain, and PER3, which is related to the regulation of circadian rhythm, were identified as personality-related genes (Table 1).
Codon insertion was not registered for the variant located at chrX:36800238 on MAOA in either the Ensembl or the UCSC genome database; however, codon deletion was recorded at the same location in the European Variation Archive database. Moreover, allele frequency data for MAOA, which is on a sex chromosome, were not published in the Thoroughbred variant database.

Variants involving missense mutations were summarized using Human Genome Variation Society protein nomenclature (HGVSp) based on their amino acid location and substitution. Differences in molecular weight and hydrophobicity index owing to substitutions were analyzed for each variant (Table 3). Eight variants registered an absolute molecular weight difference of ≥40.00 Da, and fifteen variants resulted in an absolute deviation of ≥1.00 in the hydrophobicity index. Moreover, two HGVSp descriptions were identified for each of the variants at chr3:31382147 and chr3:31845727 on CDH13, because the codon locations differ between splice variants.
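To show how an HGVSp missense description breaks down into reference residue, position, and substituted residue, the following minimal sketch parses the three-letter form with a regular expression. The example strings and their positions are hypothetical and are not taken from Table 3; other HGVSp forms (for example insertions or synonymous changes) are deliberately left unparsed.

```python
import re
from typing import NamedTuple, Optional


class MissenseChange(NamedTuple):
    ref: str       # reference amino acid, three-letter code
    position: int  # residue number in the given transcript's protein
    alt: str       # substituted amino acid, three-letter code


# Matches simple missense descriptions such as "p.Cys100Arg".
_MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_hgvsp(description: str) -> Optional[MissenseChange]:
    """Parse a three-letter HGVSp missense string; return None otherwise."""
    match = _MISSENSE.match(description)
    if match is None:
        return None
    ref, position, alt = match.groups()
    return MissenseChange(ref, int(position), alt)


# Hypothetical examples: one genomic variant can map to different residue
# numbers in different splice variants, giving more than one description.
for text in ("p.Cys100Arg", "p.Cys62Arg", "p.Cys100="):
    print(text, "->", parse_hgvsp(text))
```

Keeping the position tied to a specific transcript matters here, since the same genomic locus can yield two valid HGVSp descriptions, as reported for the CDH13 variants.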

Discussion
In this study, 28 human personality-related genes and their orthologous genes in horses were identified. Among them, 18 genes were found to carry 55 DNA polymorphisms annotated as non-synonymous substitutions in the whole genome variant database of a population of 101 Thoroughbreds. Missense mutations potentially influence the structure of proteins, thereby altering their functions. Therefore, these 18 genes are putative candidate genes affecting personality diversity in horses. Moreover, among these genes, BDNF, HTR2A, and MAOA have previously been reported as candidates principally associated with personality in horses [13]. Collectively, these results indicate that candidate genes can be identified using this bioinformatics method.
Amino acid substitutions that significantly alter molecular weight can change the primary and secondary structure and influence the functional characteristics of proteins [28]. Eight variants were detected that cause significant changes in the molecular weight of amino acids encoded by seven genes, namely, ANKK1, APOE, CDH13, DGKH, FAAH, HSD11B1, and MAOA. Accordingly, these variants could influence the protein-protein interaction network and, therefore, pathways associated with personality.
Altered hydrophobicity levels can disrupt the configuration of amino acids, resulting in changes in protein conformation and native folding [29]. Fifteen variants in seven genes (ANKK1, APOE, CDH13, DRD2, FAAH, GABRA6, and MAOA) were identified that significantly affected the hydrophobicity of the resulting amino acids. Therefore, the protein transportation and intercellular signaling pathways regulated by these genes may be affected by these amino acid alterations.
Gain or loss of an amino acid that participates in posttranslational modifications can alter the activity or localization of the protein, thereby affecting horse personality diversity. The variant at chrX:36799409 in MAOA removes cysteine, which is susceptible to S-nitrosylation; therefore, the enzyme activity and expression level of monoamine oxidase A (MAOA) may also be affected [30]. Variants causing gain or loss of asparagine, which is susceptible to N-glycosylation, were detected in genes encoding MAOA, ankyrin repeat and kinase domain containing 1 protein (ANKK1), and period circadian protein homolog 3 protein (PER3) [31]. Changes in folded protein structure may influence the enzymatic activity of MAOA and ANKK1 and the efficiency of transportation and signal transduction of PER3. Additionally, serine, tyrosine, and threonine are frequently subjected to phosphorylation, which plays a role in cell signal switching [32]; variants causing gain or loss of these amino acids were detected in MAOA, ANKK1, PER3, HSD11B1, NPY, CDH13, DGKH, GABRA6, P2RX7, COMT, HTR2A, and LEP. Hence, amino-acid-specific posttranslational modification influences protein function and may also contribute to horse personality diversity.
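The residue-level screen discussed above can be sketched as a lookup that reports which modification-prone residues a substitution gains or loses. This is a deliberate simplification and an assumption of the example: it considers only the residue identity named in the text and ignores sequence context (for instance, N-glycosylation in practice also depends on neighboring residues), so a flag indicates a possibility rather than a confirmed modification site.

```python
# Residues named in the text as susceptible to posttranslational modification.
PTM_RESIDUES = {
    "Cys": "S-nitrosylation / disulfide bond",
    "Asn": "N-glycosylation",
    "Ser": "phosphorylation",
    "Thr": "phosphorylation",
    "Tyr": "phosphorylation",
}


def ptm_changes(ref: str, alt: str) -> list:
    """List (event, residue, modification) tuples for one substitution.

    Only residue identity is checked; local sequence context is ignored.
    """
    changes = []
    if ref == alt:
        return changes
    if ref in PTM_RESIDUES:
        changes.append(("loss", ref, PTM_RESIDUES[ref]))
    if alt in PTM_RESIDUES:
        changes.append(("gain", alt, PTM_RESIDUES[alt]))
    return changes


# Cysteine to arginine, as reported for chrX:36799409 on MAOA.
print(ptm_changes("Cys", "Arg"))
# Hypothetical substitutions, used only to show the other outcomes.
print(ptm_changes("Ala", "Ser"))
print(ptm_changes("Ala", "Val"))
```

A substitution between two modification-prone residues, such as serine to asparagine, returns both a loss and a gain, which is why the function returns a list rather than a single label.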
Notably, many of the identified genes were related to Neuroticism. This personality trait indicates a neuropathological characteristic [33] and has been linked to a depression-related phenotype [34]. Moreover, ANKK1 and PER3 were associated with Extraversion, NPY was associated with Conscientiousness, CDH13 and CNR1 were associated with Agreeableness, and DGKH was associated with Openness; each of these genes was also associated with Neuroticism, suggesting that they mediate a trade-off or compatibility between the two personality traits.
Of the SNPs among the 55 variants, 37.50% had a minor allele frequency (MAF) of 5% or less. These rare SNPs may not relate to the universal diversity of personality in the Thoroughbred population, because the allele frequency of variants with low harmfulness to an organism tends to increase via genetic drift [35]. Accordingly, the widely distributed SNPs (MAF: 0.3-0.5) may be more useful for comprehending the universal diversity of Thoroughbred personality.
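The frequency classes used in this argument can be made concrete with a short sketch. MAF is computed as the smaller of the alternate-allele frequency and its complement, and variants are then binned with the two cut-offs mentioned in the text (5% or less, and 0.3-0.5). The allele counts below are hypothetical, the class labels are chosen for this example, and the total of 202 alleles assumes an autosomal locus genotyped in all 101 diploid horses.

```python
def minor_allele_frequency(alt_count: int, total_alleles: int) -> float:
    """Return the frequency of the less common allele at a biallelic site."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 <= alt_count <= total_alleles:
        raise ValueError("alt_count must lie between 0 and total_alleles")
    alt_freq = alt_count / total_alleles
    return min(alt_freq, 1.0 - alt_freq)


def frequency_class(maf: float) -> str:
    """Bin a MAF using the cut-offs mentioned in the text."""
    if maf <= 0.05:
        return "rare"
    if 0.3 <= maf <= 0.5:
        return "common"
    return "intermediate"


# ASSUMPTION: autosomal locus with complete genotypes for 101 diploid horses.
TOTAL_ALLELES = 2 * 101

# Hypothetical alternate-allele counts, for illustration only.
for alt_count in (8, 90, 150):
    maf = minor_allele_frequency(alt_count, TOTAL_ALLELES)
    print(alt_count, round(maf, 4), frequency_class(maf))
```

Variants on the X chromosome, such as those in MAOA, would need a different allele total because males carry a single copy; this sketch does not handle that case.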
A substitution from cysteine to arginine at chrX:36799409 on MAOA may result in the loss of a disulfide bridge, a covalent bond that stabilizes protein structure. Consequently, this substitution may exacerbate the Neuroticism trait associated with MAOA, since it may destabilize the protein conformation [36,37]. Changes at loci including chr3:31845727 on CDH13, chr7:22331004 on ANKK1, chr2:12174492 on FAAH, and chrX:36799409 on MAOA can alter both molecular weight and hydrophobicity, and may thus functionally affect both the configuration of amino acids and the conformation of the encoded proteins. In particular, SNPs at chr3:31845727 and chrX:36799409 may cause alterations in protein activity due to the loss of serine and cysteine, respectively. These SNPs can thus potentially increase or decrease the expression intensity of the corresponding personality traits.
Although personality has been reported to vary between horse breeds, this can likely be attributed to environmental differences in how horses are managed and to differences in variant distribution between populations caused by selective pressures on each breed [38]. Therefore, variants in personality-related candidate genes are assumed to be at the same loci and serve the same function in other breeds as in Thoroughbreds.
An escape response that occurs due to a frightening stimulus can lead to injuries when riding a horse; however, desensitization training can reduce this response [39,40]. Selectively retraining horses with low Neuroticism can improve the efficiency of converting racehorses for general horse riding. MAOA, which is related to Neuroticism in humans, may be responsible for the difference in escape response that exists between sexes [41]. Furthermore, horses' reactivity to humans is considered to be related to the Openness trait, with a facet of curiosity [42]. Openness should be evaluated when retraining horses for use in careers that involve being touched by inexperienced and passive personnel because horses with high Openness may tend to demonstrate behaviors that have the potential to harm humans, such as biting hands or pulling clothes.

Conclusions
We developed a novel bioinformatics approach to identify equine personality-related candidate genes using related knowledge of human genes and the genome variant database of 101 Thoroughbreds. This strategy could support research progress in using genetics to predict equine personality, and it is expected to trigger investigations into the associated protein functions. Moreover, these findings may help expand knowledge about the possibility of selecting a second career for Thoroughbred horses based on genetics. This research can thus be used to improve the health and welfare of horses. At present, this bioinformatics approach should be treated as preliminary, since these gene-personality associations in horses have not been confirmed. Therefore, more in vivo studies and functional analyses are needed to demonstrate the accuracy and strength of these associations.